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Abstract 


The  overall  long-term  objective  of  our  research  is  to  develop  mathematical  and  soft¬ 
ware  infrastructure  in  support  of  post-genomics  research  in  systems  biology.  One 
near-term  objective  articulated  in  this  abstract  centers  on  a  deeper  understanding  of 
the  organizational  principles  of  biological  networks.  A  distinguishing  theme  of  this 
work  is  its  focus  on  scalable  methods  of  robustness  and  model  (in)validation  with 
data,  as  opposed  to  relying  purely  on  simulation.  In  computability  terms,  if  sim¬ 
ulation  is  viewed  as  a  way  to  attack  the  NP  hard  side  of  biological  problems,  our 
approach  attacks  the  coNP  side.  Much  of  the  success  of  reductionist  biology  has  de¬ 
pended  on  creative  individuals  who  draw  biologically  meaningful  inferences  from  data 
and  computation  using  small  scale  and  informal  reasoning.  This  type  of  inference  was 
critical  because  the  reductionist  research  program  itself  offered  no  systematic  tools  to 
deal  with  complexity,  only  with  the  component  parts.  Far  from  being  dispensed  with, 
this  reasoning  process  and  its  biological  content  must  be  both  formalized  and  made 
rigorous,  systematic,  and  scalable  as  well,  and  ultimately  teachable.  This  requires 
the  development  of  new  mathematics  as  well  as  algorithms  and  software. 

A  central  goal  of  modeling  and  simulation  is  to  connect  molecular  mechanisms  to 
network  function  to  questions  of  biomedical  relevance.  Unfortunately,  many  of  the 
most  critical  questions  involve  events  which  are  extremely  rare  at  the  individual  cell 
level  where  the  mechanisms  act  yet  catastrophic  to  the  organism.  Thus  simulation 
methods  that  may  be  adequate  for  studying  generic  or  typical  behavior  are  entirely 
inadequate  to  explore  such  worst-case  scenarios,  which  with  conventional  methods  are 
computational  intractable.  We  are  extending  the  best-practice  tools  and  algorithms 
for  robustness  analysis  that  have  become  standards  in  engineering  to  models  of  bi¬ 
ological  relevance,  which  are  typically  nonlinear,  hybrid,  uncertain,  and  stochastic. 
This  includes  integrating  formal  inference  methods  from  the  previously  fragmented 
theories  in  Computer  Science  with  those  of  Control  and  Dynamical  Systems.  This 
involves  deep  mathematical  challenges  that  parallel  those  for  technological  networks, 
for  which  we  have  made  dramatic  progress,  and  on  which  we  are  building  new  tools 
for  systems  biology. 
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1  General  Introduction  and  Background 


Biological  networks  connect  devices  of  enormous  complexity  and  sophistication  even 
at  their  molecular  level  into  modular  components  for  sensing,  signal  processing,  com¬ 
munication,  computation,  and  actuation.  These  components  are  further  integrated 
into  vast  regulatory  networks  with  layers  of  feedback.  No  one  doubts  the  vast  range 
of  capabilities  that  a  deep  understanding  of  this  biological  complexity  would  enable, 
but  beyond  the  need  for  improved  experimental  technology,  and  sophisticated  bion- 
formatics  to  manage  the  data  such  improvements  would  continue  to  yield,  there  is 
little  consensus  as  to  exactly  what  further  must  be  done.  We  claim  that  the  over¬ 
whelmingly  greatest  source  of  complexity  and  the  least  current  understanding  lies 
in  the  signaling,  communications,  and  computation  modules,  and  even  more  so  in 
the  feedback  control  systems  that  they  comprise.  The  solution  to  the  challenge  of 
biological  complexity  may  not  be  to  fundamentally  change  the  way  in  which  current 
molecular  biology  research  is  done  so  much  as  to  augment  this  research  with  a  new 
way  of  thinking  about  the  systems  integration  issue  itself. 

Despite  its  enormous  success,  the  reductionist  program  provides  a  poor  foundation 
for  many  new  technical  challenges.  For  example,  the  ubiquitous  connectivity  and 
flexibility  of  the  Internet  as  observed  by  the  user  is  taken  for  granted,  as  are  the 
wires,  chips,  and  displays  that  make  up  the  hardware,  but  it  is  rare  for  nonexperts  to 
be  aware  of  the  complex  layers  of  protocols  and  feedback  regulation  that  makes  the 
Internet’s  flexibility  and  robustness  possible.  Until  recently,  there  has  been  limited 
theoretical  support  for  the  study  of  the  systems-level  challenges  in  either  internet¬ 
working  or  biology,  and  limited  academic  research.  Nevertheless,  for  some  time  there 
has  been  a  widely  shared  vision  there  could  be  universal  features  of  complex  systems 
that  can  transcend  these  reductionist  decompositions  [1,  2,  3],  and  provide  a  unifying 
integration.  Sharp  differences  have  arisen  however  with  regard  to  exactly  what  those 
features  are.  We  believe  there  is  now  a  clear,  compelling,  and  coherent  path  emerging 
from  the  striking  convergence  of  the  three  research  themes  of  biology,  technology,  and 
mathematics. 

First,  biologists  have  provided  a  detailed  description  of  the  components  of  biolog¬ 
ical  networks,  and  many  organizational  principles  of  these  networks  are  becoming 
increasingly  apparent.  Second,  advanced  information  technologies  have  enabled  en¬ 
gineering  systems  to  approach  biology  in  their  complexity.  We  are  developing  new 
theories  that  elucidate  these  similarities  that  are  comparable  in  depth  and  richness 
with  those  available  for  more  traditional  subdisciplines.  While  these  share  with  their 
traditional  counterparts  many  of  the  domain-specific  assumptions  that  overcome  the 
intractability  of  more  general  formulations,  this  progress  has  sharpened  the  mathe¬ 
matical  questions  that  are  relevant  to  these  important  application  domains.  Thus 
we  have  the  beginnings  of  the  first  coherent,  complete  theoretical  foundation  of  the 
Internet  [4,  5,  6,  7,  8,  9],  and  have  also  been  developing  new  theory  and  software 
infrastructure  to  support  systems  biology  [10,  11,  12,  1,  13,  14].  We  are  making  rigor- 


ous  and  precise  the  notion  that  this  apparent  network-level  evolutionary  convergence 
within  and  between  biology  and  technology  is  not  accidental,  but  follows  necessarily 
from  the  universal  requirements  of  efficiency  and  robustness. 

While  the  full  consequences  of  the  claimed  convergence  emerging  from  these  two 
areas  will  take  years  to  be  fully  resolved,  an  important  message  is  now  clear.  The 
method  of  decomposing  complex  systems  into  vertical  layers  of  varying  complexity 
and  scale,  wherein  each  layer  is  further  decomposed  horizontally  into  modules,  appears 
to  be  not  only  ubiquitous  but  necessary.  It  is  neither  an  accident  of  evolution  nor 
merely  an  artificial  construct  imposed  by  humans  to  make  biology  and  technology 
comprehensible,  although  that  may  be  a  wonderfully  serendipitous  side-effect.  Thus 
we  do  not  advocate  abandoning  the  reductionist  program  of  decomposing  complexity, 
but  in  managing  the  process  more  consciously  and  systematically.  The  disciplinary 
decompositions  that  exist  may  indeed  be  historical  artifices,  but  the  need  for  such 
decompositions  is  not.  The  key  to  creating  an  integrated  approach  to  understanding, 
exploiting,  and  mimicking  biological  complexity  is  not  to  replace  existing  technologies, 
but  to  augment  them  with  a  more  flexible  and  rigorous  technology  for  decomposition 
and  recomposition. 

Finally,  the  mathematical  foundation  is  being  developed  for  a  far  more  unified  the¬ 
ory  of  complex  systems  that  overcomes  the  intractability  that  forced  the  disciplinary 
fragmentation  in  the  first  place,  and  this  is  the  most  important  development  for  this 
project.  It  is  in  retrospect  unsurprising  that  a  genuinely  new  science  of  complexity, 
particularly  biological,  would  require  equally  new  mathematics  to  answer  basic  uni¬ 
versal  questions  such  as:  Is  a  model  consistent  with  experimental  data,  which  may 
come  from  extremely  heterogeneous  sources?  If  so,  is  it  robust  to  additional  pertur¬ 
bations  that  are  plausible  but  untested?  Are  different  models  at  multiple  scales  of 
resolution  consistent?  What  is  the  most  promising  experiment  to  refute  or  refine  a 
model?  These  questions  are  all  naturally  nonlinear,  nonequilibrium,  uncertain,  hybrid 
and  so  on,  and  their  analysis  has  relied  mainly  on  simulation.  Unfortunately,  simula¬ 
tion  alone  is  inadequate.  One  computer  simulation  produces  one  example  of  one  time 
history  for  one  set  of  parameters  and  initial  conditions.  Thus  simulations  can  only 
ever  provide  counterexamples  to  hypotheses  about  the  behavior  of  a  complex  system, 
and  can  never  provide  proofs.  (In  technical  terms,  they  can  in  principle  provide  sat¬ 
isfactory  solutions  to  questions  in  NP,  but  not  to  questions  in  coNP.)  Simulations 
can  never  prove  that  a  given  behavior  or  regularity  is  necessary  and  universal;  they 
can  at  best  show  that  a  behavior  is  generic  or  typical.  What  is  needed  is  an  effective 
(and  scalable)  method  for,  in  essence,  systematically  proving  robustness  properties 
of  nonlinear  dynamical  systems.  The  possibility  of  such  a  thing  (especially  without 
P— NP=coNP  in  computational  complexity  theory)  is  profound  and  remarkable,  and 
it  is  the  foundation  of  our  approach. 


1.1  The  Organization  of  Biological  Networks 


One  of  our  goals  is  to  develop  a  theory  of  biological  organization  that  exploits  the 
features  of  evolution  and  robustness  to  constrain  the  search  spaces  in  our  analysis 
algorithms.  Specifically,  our  computational  methods  for  modeling  from  data,  simula¬ 
tion,  and  robustness  analysis  need  not  solve  arbitrary  unstructured  problems,  which 
are  certainly  intractable,  but  only  those  that  are  biological  meaningful.  Biological  sys¬ 
tems  at  every  level  of  organization  are  highly  structured,  far  from  equilibrium,  persist 
there  robustly  despite  fluctuations  in  their  environment  and  their  components,  and 
have  evolved  to  this  highly  organized  state.  This  places  constraints  on  biological 
organization  that  has  some  parallels  in  technology  but  none  in  the  other  sciences. 
Algorithms  that  exploit  this  organization  can  be  almost  arbitrarily  more  efficient  and 
reliable  than  those  that  do  not,  but  it  requires  a  rigorous  theory  to  connect  the  ro¬ 
bustness  and  evolvability  of  biological  networks,  with  algorithms  for  modeling  and 
system  identification,  analysis,  and  simulation.  All  of  our  results  so  far  are  extremely 
encouraging,  but  are  merely  the  beginning  of  what  we  believe  is  possible. 

Our  main  effort  on  organizational  principles  is  to  identify  the  features  of  biological 
networks,  as  opposed  to  arbitrary  sets  of  chemical  reactions,  that  make  automatic 
and  scalable  computational  methods  feasible,  even  when  the  computational  complex¬ 
ity  classes  are  worst-case  exponential  or  worse.  A  smaller  effort  has  been  focused 
on  additionally  elucidating  organizational  principles  to  provide  greater  understand¬ 
ing  of  biological  complexity.  We  want  to  help  answer  the  question  “what  is  all  this 
complexity  for?”  [15].  This  will  have  a  huge  impact  on  computation,  but  the  results 
can  go  beyond  that.  In  particular,  our  program  could  be  viewed  as  thinking  of  bi¬ 
ological  networks  as  a  kind  of  technological  network  built  on  the  physical  substrate 
of  biochemistry,  as  opposed  to,  say,  the  CMOS  VLSI  and  fiber  optics  of  the  Inter¬ 
net.  Biological  networks  integrate  controls,  communications,  and  computing  in  a  way 
that  engineers  are  just  beginning  to  understand  in  a  deeply  theoretical  way,  and  we 
have  had  great  success  on  the  forefront  of  those  efforts.  By  explicitly  connecting 
the  theoretical  challenges  in  advanced  technological  and  biological  networks,  there  is 
the  promise  for  substantial  synergy,  and  there  is  strong  evidence  already  that  this 
approach  will  bring  novel  insights  to  both  areas  [16,  17,  18,  19,  20,  21,  22], 


1.2  Robust  yet  fragile  systems 

An  emphasis  on  scalable  and  provably  correct  analysis  methods  is  not  just  for  math¬ 
ematical  completeness,  but  is  driven  by  a  ubiquitous  property  of  complex  engineering 
and  natural  networked  systems:  they  are  robust  yet  fragile  (RYF).  Complex  networks 
can  provide  remarkable  robustness  despite  large  perturbations  in  their  environments 
and  component  parts,  but  they  can  also  be  extremely  fragile  to  cascading  failure 
events  triggered  by  relatively  small  perturbations.  We  experience  various  illnesses, 


crashes  due  to  software  bugs,  viruses,  worms,  and  denial-of-services  attacks,  power 
glitches,  security  screenings,  etc,  as  annoying  but  rarely  catastrophic.  Typically,  our 
networks  protect  us.  But  cancer  and  other  epidemics,  chronic  auto-immunity,  market 
crashes,  terrorist  attacks,  large  power  outages  and  fires,  etc,  remind  us  that  our  com¬ 
plexity  has  a  price.  Indeed,  most  dollars  and  lives  lost  in  natural  and  technological 
disasters  happen  in  the  few  largest  events,  while  the  typical  event  is  so  small  as  to 
usually  go  unreported. 

Many  current  military  technical  visions  convincingly  suggests  that  network  complex¬ 
ity  can  provide  robustness  and  efficiency  that  ultimately  greatly  exceeds  that  of  com¬ 
parable  brute  force  approaches.  The  ultimate  challenge  will  not  be  to  make  this  ap¬ 
parent  in  demonstrations  and  typical  scenarios,  but  to  avoid  the  rare  but  catastrophic 
failures  that  seem  to  inevitable  accompany  new  levels  of  complexity.  Unfortunately, 
the  entire  scientific  enterprise  of  experimentation,  modeling,  and  simulation  of  com¬ 
plex  systems  has  been  most  successful  at  studying  their  typical  or  generic  behaviors. 
Thus  it  should  be  no  surprise  that  the  rigorous  study  of  the  fragility  of  complex 
systems  would  require  new  methods. 

That  the  intrinsically  “robust  yet  fragile”  (RYF)  nature  of  complex  systems  [23,  24, 
2,  17,  3,  25]  has  the  computational  counterpart  of  “dual  complexity  implies  primal 
fragility”  is  a  key  feature  of  our  approach.  Practically  speaking,  this  completely 
changes  what  is  possible  computationally.  Organisms,  ecosystems,  and  successful 
advanced  technologies  are  highly  constrained  in  that  they  are  not  evolved/designed 
arbitrarily,  but  necessarily  in  ways  that  are  robust  to  uncertainties  in  their  environ¬ 
ment  and  their  component  parts.  These  are  extremely  severe  constraints,  not  present 
in  other  sciences  but  essential  in  both  biology  and  engineering.  The  most  obvious  fea¬ 
ture  is  that  their  macroscopic  system  properties  can  be  both  extremely  robust  to  most 
microscopic  details  yet  hyper-fragile  to  a  few,  and  this  must  shape  both  modeling  and 
analysis,  and  the  experimental  process  that  it  interacts  with.  If  most  details  don’t 
matter,  most  experiments  are  relatively  uninformative.  If  a  few  details  are  crucial, 
then  this  is  where  both  modeling  and  experiments  must  focus,  but  neither  a  purely 
top-down  nor  bottom-up  approach  can  reliably  find  them. 

Thus  failure  to  explicitly  exploit  the  highly  structured,  organized,  and  “robust  yet 
fragile”  nature  of  such  systems  hopelessly  dooms  any  method  to  be  overwhelmed  by 
their  sheer  complexity.  Technically  speaking,  we  can  now  formulate  a  wide  range 
of  questions  for  very  general  dynamical  systems  under  a  common  Lyapunov-type 
umbrella,  converting  them  into  statements  involving  semi-algebraic  sets,  polynomial 
(nonlinear)  equations  and  inequalities.  Proving  such  statements  is  still  coNP-hard, 
but  real  algebraic  geometry,  semi-definite  programming,  and  duality  theory  from 
optimization  provide  new  methods  to  systematically  exhaust  coNP  by  searching  for 
nested  families  of  short  proofs  using  convex  relaxations.  Not  only  can  we  search  for 
short  proofs  systematically,  but  a  lack  of  short  proofs  implies,  by  a  generalization  of 
duality,  intrinsic  fragilities  in  the  question  itself.  This  feedback  from  computation  to 
modeling  does  not  imply  P— NP=coNP,  which  is  unlikely,  but  rather  that  inference 


problems  within  coNP  lacking  short  proofs  can  be  traced  to  specific  and  meaningful 
flaws  in  models  or  data  for  which  resolution  can  then  be  systematically  pursued.  Note 
that  this  is  a  radical  broadening  of  the  numerical  analysts  notion  of  ill-conditioning, 
and  involves  mathematics  from  a  variety  of  previously  unrelated  disciplines.  Again, 
in  retrospect,  this  should  not  be  surprising,  but  it  creates  enormous  challenges  in 
both  education  and  the  review  process. 

Though  this  is  all  very  new,  these  methods  have  already  found  substantial  applications 
in  networking,  biology,  physics,  dynamical  systems,  controls,  algorithms,  and  finance 
[26,  27,  28,  29,  30,  31,  32,  26,  33,  34,  35,  36,  37,  38,  39,  40,  41,  42,  43],  and  work  on 
connections  with  communications  theory  is  in  progress  and  discussed  in  the  technical 
details.  A  side  benefit  of  a  deepening  understanding  of  the  fundamental  nature  of 
complexity  in  a  general  sense  is  also  a  new  and  more  rigorous  explanations  for  long¬ 
standing  problems  in  physics  associated  with  complex  systems[24,  29,  40,  41,  42,  43, 
44,  45,  29,  46,  47]. 


1.3  Robust  and  Scalable  Validation  of  Models  Against  Data 

Simulation  will  always  be  a  workhorse  of  systems  biology,  but  it  can  be  enhanced  sub¬ 
stantially  if  conjectures  formulated  using  simulation  can  be  proved  rigorously.  The 
linchpin  of  our  proposed  modeling  system  is  the  development  and  implementation 
of  theoretically-sound  methods  for  model  validation.  Although  some  existing  soft¬ 
ware  tools  provide  mechanisms  for  comparing  a  model’s  behavior  to  experimental 
data  (e.g.,  Gepasi  [48,  49]),  the  methods  used  to  date  have  been  ad-hoc,  brute-force 
approaches  that  do  not  scale  to  larger  models.  The  theoretical  framework  described 
later  in  this  document  represents  an  unprecedented  opportunity  to  create  a  system  for 
model  analysis  and  validation  and  iterative  experimentation  for  large-scale,  stochas¬ 
tic,  nonlinear,  nonequilibrium,  mixed  continuous  and  discrete  models  with  multiple 
time  and  spatial  scales.  The  remarkable  quality  of  the  theory  is  that  it  can  be  used 
to  prove  conjectures  for  this  difficult  class  of  models  such  as  “this  model  cannot  fit 
the  data,  no  matter  what  parameters  we  use”  and  “this  model  is  robust  no  matter 
how  parameters  are  varied.”  This  is  something  that  previously  has  not  been  possi¬ 
ble  except  for  much  simpler  models.  Yet  this,  together  with  sophisticated  robustness 
analysis  methods,  is  exactly  the  capability  needed  in  order  to  allow  realistic  biological 
models  to  be  analyzed  and  related  back  to  the  experimental  data  to  help  answer  the 
question,  “What  is  the  next  experiment  that  would  best  differentiate  between  the 
current  alternative  hypotheses?” 

We  are  relying  on  SOSTOOLS  as  a  founcation  for  the  system  identification  and 
parameter  estimation  research,  however  this  reliance  is  less  than  it  might  appear.  Our 
SOS/SDP  framework  actually  recovers  as  special  cases  essentially  all  of  the  standard 
methods,  so  the  worst-case  scenario  is  that  it  merely  provide  an  integrated  and  unified 
method  to  access  what  might  otherwise  appear  to  be  quite  disparate  methods.  This 


is  not  an  aspect  of  our  methods  that  we  emphasize  but  it  is  an  important  element 
in  our  optimism  about  their  potential.  Perhaps  more  important  is  the  converse, 
that  we  are  suspicious  of  new  methods  that  cannot  capture  gold-standard  methods 
already  in  existence.  Another  important  issue  is  that  the  SOS/SDP  methods  are 
the  only  candidates  for  a  successor  to  linear  programs  in  providing  all  the  features 
of  automation,  scalability,  duality,  structure,  and  fragility  to  hard  problem  classes 
involving  stochastic,  hybrid,  and  nonlinear  dynamical  systems,  in  addition  to  reducing 
to  linear  programs  in  the  special  cases  when  it  applies.  These  two  features,  plus  the 
implementation  in  a  MATLAB  toolbox,  makes  SOSTOOLS  unique.  It  also  has  the 
benefit  that  a  large,  diverse,  and  sophisticated  research  community  spanning  control 
and  dynamical  systems,  hybrid  systems,  optimization,  and  many  areas  of  pure  and 
applied  mathematics  has  recently  begun  to  focus  substantial  attention  in  this  area. 


2  Technical  Background 

2.1  Sum  of  Squares  Programming 

Consider  a  given  system  of  polynomial  equations  and  inequalities,  for  instance: 

fi(x  i,x2):=  xj  +  xij-l  =  0, 

gi(xi,x2)  :=  3  x2  —  xj  —  2  >  0,  (1) 

9i{x  i,x2):=  xi-8x2>0. 

How  can  one  find  real  solutions  (xi,  x2)?  How  to  prove  that  they  do  not  exist?  And 
if  the  solution  set  is  nonempty,  how  to  optimize  a  polynomial  function  over  this  set? 

Until  a  few  years  ago,  the  default  answer  to  these  and  similar  questions  would  have 
been  that  the  possible  nonconvexity  of  the  feasible  set  and/or  objective  function 
precludes  any  kind  of  analytic  global  results.  Even  today,  the  methods  of  choice 
for  most  practitioners  would  probably  employ  mostly  local  techniques  (Newton’s  and 
its  variations),  possibly  complemented  by  a  systematic  search  using  deterministic  or 
stochastic  exploration  of  the  solution  space,  interval  analysis  or  branch  and  bound. 

However,  very  recently  there  have  been  renewed  hopes  for  the  efficient  solution  of 
specific  instances  of  this  kind  of  problems.  The  main  reason  is  the  appearance  of 
methods  that  combine  in  a  very  interesting  fashion  ideas  from  real  algebraic  geometry 
and  convex  optimization  [50,  27,  51].  As  we  will  see,  these  methods  are  based  on  the 
intimate  links  between  sum  of  squares  decompositions  for  multivariate  polynomials 
and  semidefinite  programming  (SDP). 

We  will  outline  the  essential  elements  of  this  new  research  approach  as  introduced  in 


[27,  52].  The  centerpieces  will  be  the  following  two  facts  about  multivariate  polyno¬ 
mials  and  systems  of  polynomials  inequalities: 

1.  Sum  of  squares  decompositions  can  be  computed  using  semidefinite  program¬ 
ming. 

2.  The  search  for  infeasibility  certificates  is  a  convex  problem.  For  bounded  degree, 
it  is  an  SDP. 

We  will  define  the  basic  ideas  needed  to  make  the  assertions  above  precise,  and  explain 
the  relationship  with  earlier  techniques.  For  this,  we  will  introduce  sum  of  squares 
polynomials  and  the  notion  of  sum  of  squares  programs.  We  then  explain  how  to  use 
them  to  provide  infeasibility  certificates  for  systems  of  polynomial  inequalities,  finally 
putting  it  all  together  via  the  surprising  connections  with  optimization. 


2.1.1  Sums  of  Squares  and  SOS  Programs 


Our  notation  is  mostly  standard.  The  monomial  xa  associated  to  the  n- tuple  a  = 
(cq, . . .  ,an)  has  the  form  x"l...x"n,  where  a*  €  No-  The  degree  of  a  monomial 
xa  is  the  nonnegative  integer  £)"=1  ai-  A.  polynomial  is  a  finite  linear  combination 
of  monomials  Ylass  caxa,  where  the  coefficients  ca  are  real.  If  all  the  monomials 
have  the  same  degree  d,  we  will  call  the  polynomial  homogeneous  of  degree  d.  We 
denote  the  ring  of  multivariate  polynomials  with  real  coefficients  in  the  indeterminates 
{xi,...,x„}  as  77[x]. 

A  multivariate  polynomial  is  a  sum  of  squares  (SOS)  if  it  can  be  written  as  a  sum  of 
squares  of  other  polynomials,  i.e., 

p{x)  =  Y  (x)»  x )  € 

i 

If  p(x)  is  SOS  then  clearly  p{x)  >  0  for  all  x.  In  general,  SOS  decompositions  are  not 
unique.  For  example,  the  polynomial  p(x i,  x2)  =  xi  ~  x\x\  +  x2  +  1  is  SOS.  Among 
infinite  others,  it  has  the  decompositions:  p[x i,  x2)  =  f  (^i  —  x2)2  +  |(aq  +  x\)2  + 1  = 
|  (3  —  x\)2  +  §x2  +  jjfgC&Ei  —  16z2)2  +  ffx2-  The  sum  of  squares  condition  is  a  quite 
natural  sufficient  test  for  polynomial  nonnegativity.  Its  rich  mathematical  structure 
has  been  analyzed  in  detail  in  the  past,  notably  by  Reznick  and  his  coauthors  [53,  54], 
but  until  very  recently  the  computational  implications  have  not  been  fully  explored. 
In  the  last  few  years  there  have  been  some  very  interesting  new  developments  sur¬ 
rounding  sums  of  squares,  where  several  independent  approaches  have  produced  a 
wide  array  of  results  linking  foundational  questions  in  algebra  with  computational 
possibilities  arising  from  convex  optimization.  Most  of  them  employ  semidefinite  pro¬ 
gramming  (SDP)  as  the  essential  computational  tool.  For  completeness,  we  present 
in  the  next  paragraph  a  brief  summary  of  SDP. 


Semidefinite  programming  SDP  is  a  broad  generalization  of  linear  programming 
(LP),  to  the  case  of  symmetric  matrices.  Denoting  by  Sn  the  space  of  nxn  symmetric 
matrices,  the  standard  SDP  primal-dual  formulation  is: 


minx  C  •  X 


maxy  bTy, 


(  Aj»X  =  bi,  i  =  l,...,m 

\  x  y  o 


S.t.  YZ=  1  AiVi  ^  c 


(2) 


where  Ait  C,  X  6  Sn  and  b,  y  €  7 Zm.  The  matrix  inequalities  are  to  be  interpreted  in 
the  partial  order  induced  by  the  positive  semidefinite  cone,  i.e. ,  X  y  Y  means  that 
X  —  Y  is  a  positive  semidefinite  matrix.  Since  its  appearance  almost  a  decade  ago 
(related  ideas,  such  as  eigenvalue  optimization,  have  been  around  for  decades)  there 
has  been  a  true  “revolution”  in  computational  methods,  supported  by  an  astonishing 
variety  of  applications.  By  now  there  are  several  excellent  introductions  to  SDP; 
among  them  we  mention  the  well-known  work  of  Vandenberghe  and  Boyd  [55]  as  a 
wonderful  survey  of  the  basic  theory  and  initial  applications,  and  the  handbook  [56] 
for  a  comprehensive  treatment  of  the  many  aspects  of  the  subject. 


From  SDP  to  SOS  The  main  object  of  interest  in  semidefinite  programming  is 
quadratic  forms,  that  are  positive  semidefinite.  When  attempting  to  generalize  this 
construction  to  homogeneous  polynomials  of  higher  degree,  an  unsurmountable  diffi¬ 
culty  that  appears  is  the  fact  that  deciding  nonnegativity  for  quartic  or  higher  degree 
forms  is  an  NP-hard  problem.  Therefore,  a  computational  tractable  replacement  for 
this  is  the  following:  even  degree  polynomials,  that  are  sums  of  squares. 

Sum  of  squares  programs  can  then  be  defined  as  optimization  problems  over  affine 
families  of  polynomials,  subject  to  SOS  contraints.  Like  SDPs,  there  axe  several 
possible  equivalent  descriptions.  We  choose  below  a  free  variables  formulation,  to 
highlight  the  analogy  with  the  standard  SDP  dual  form  discussed  above.  A  sum  of 
squares  program  has  the  form 

maxy  biyi  +  •  •  •  +  bmym 
s.t.  Pi{x,y)  are  SOS,  i  — 

where  Pi(x,y)  :=  Ci(x)  +  An{x)yi  H - 1-  Aim(x)ym,  and  the  Ct,  Ay  are  given  poly¬ 

nomials  in  the  variables  x,. 

SOS  programs  are  very  useful,  since  they  directly  operate  with  polynomials  as  their 
basic  objects,  thus  providing  a  quite  natural  modelling  formulation  for  many  prob¬ 
lems.  Among  others,  examples  for  this  are  the  search  for  Lyapunov  functions  for 
nonlinear  systems  [27,  57],  probability  inequalities  [58],  as  well  as  the  relaxations  in 
[27,  51]  discussed  below. 

Interestingly  enough,  despite  their  apparently  greater  generality,  sum  of  squares  pro¬ 
grams  are  in  fact  equivalent  to  SDPs.  On  the  one  hand,  by  choosing  the  polynomials 


Cfix),  Aij(x)  to  be  quadratic  forms,  we  recover  standard  SDP.  On  the  other  hand,  as 
we  will  see  in  the  next  section,  it  is  possible  to  exactly  embed  every  SOS  program 
into  a  larger  SDP.  Nevertheless,  the  rich  algebraic  structure  of  SOS  programs  will 
allow  us  a  much  deeper  understanding  of  their  special  properties,  as  well  as  enable 
customized,  more  efficient  algorithms  for  their  solution  [59].  Furthermore,  as  illus¬ 
trated  in  later  sections,  there  are  numerous  questions  related  to  some  foundational 
issues  in  nonconvex  optimization  that  have  simple  and  natural  formulations  as  SOS 
programs. 


SOS  programs  as  SDPs  Sum  of  squares  programs  can  be  written  as  SDPs.  The 
reason  is  the  following  theorem:  A  polynomial  p{x)  is  SOS  if  and  only  if  p(x)  =  zTQz, 
where  z  is  a  vector  of  monomials  in  the  x,  variables,  Q  €  SN  and  Q  y  0.  In  other 
words,  every  SOS  polynomial  can  be  written  as  a  quadratic  form  in  a  set  of  monomials 
of  cardinality  N,  with  the  corresponding  matrix  being  positive  semidefinite.  The 
vector  of  monomials  z  (and  therefore  N)  in  general  depends  on  the  degree  and  sparsity 
pattern  of  p{x).  If  p(x)  has  n  variables  and  total  degree  2d,  then  z  can  always  be 
chosen  as  a  subset  of  the  set  of  monomials  of  degree  less  than  or  equal  to  d,  of 
cardinality  N  =  (n+d).  Consider  again  the  polynomial  p(x\,  X2)  —  x\  —  x\x\  +  x\  + 1. 
It  has  the  representation 


P{xi,x  2)  = 


1 

T 

'60-2  O' 

1 

1 

^2 

0  4  0  0 

6 

x\ 

-2  0  6  -3 

x\ 

Xi 

0  0-3  6 

Xi 

and  the  matrix  in  the  expression  above  is  positive  semidefinite. 


In  the  representation  f(x)  =  zTQz,  for  the  right-  and  left-hand  sides  to  be  identical, 
all  the  coefficients  of  the  corresponding  polynomials  should  be  equal.  Since  Q  is  simul¬ 
taneously  constrained  by  linear  equations  and  a  positive  semidefiniteness  condition, 
the  problem  can  be  easily  seen  to  be  directly  equivalent  to  an  SDP  feasibility  prob¬ 
lem  in  the  standard  primal  form  (2).  Given  a  SOS  program,  we  can  use  the  theorem 
above  to  construct  an  equivalent  SDP.  The  conversion  step  is  fully  algorithmic,  and 
has  been  implemented,  for  instance,  in  the  SOSTOOLS  software  package,  described 
in  the  next  section.  Therefore,  we  can  in  principle  directly  apply  all  the  available 
numerical  methods  for  SDP  to  solve  SOS  programs. 


2.1.2  Algebra  and  Optimization 


A  central  theme  throughout  convex  optimization  is  the  idea  of  infeasibility  certificates 
(for  instance,  in  LP  via  Farkas’  lemma),  or  equivalently,  theorems  of  the  alternative. 
As  we  will  see,  the  key  link  relating  algebra  and  optimization  in  this  approach  is  the 


fact  that  infeasibility  can  always  be  certified  by  a  particular  algebraic  identity,  whose 
solution  is  found  via  convex  optimization. 


Ideals  and  cones  For  later  reference,  we  define  here  two  important  algebraic  ob¬ 
jects;  the  ideal  and  the  cone  associated  with  a  set  of  polynomials:  Given  a  set  of 
multivariate  polynomials  {/i ,  -  -  - ,  /m}>  let 


ideal(/i, . . . ,  fm)  :=  {/ 1  /  =  ^Ufu  U  e  Tl[x}}. 

i=  1 

Also,  given  a  set  of  multivariate  polynomials  {gi,. . . ,  gm},  let 

cone(5i,  •  •  • ,  9m)  :=  {g  \  g  =  $o  +  ^  Mi  +  ^  s^g^j  +  ^  8^9 j9k  +  •••}> 

{*}  {til  {iJM 

where  each  term  in  the  sum  is  a  squarefree  product  of  the  polynomials  g ,,  with  a 
coefficient  sa  €  7Z[x]  that  is  a  sums  of  squares.  The  sum  is  finite,  with  a  total  of 
2m  —  1  terms,  corresponding  to  the  nonempty  subsets  of  {<71, ... ,  gm}. 

These  algebraic  objects  will  be  used  for  deriving  valid  inequalities,  which  are  logical 
consequences  of  the  given  constraints.  Notice  that  by  construction,  every  polynomial 
in  ideal (/,)  vanishes  in  the  solution  set  of  ffix)  =  0.  Similarly,  every  element  of 
cone(<7,)  is  clearly  nonnegative  on  the  feasible  set  of  gfix)  >  0.  The  notions  of  ideal 
and  cone  as  used  above  are  standard  in  real  algebraic  geometry;  see  for  instance  [60]. 
In  particular,  the  cones  are  also  referred  to  as  a  preorders.  Notice  that  as  geometric 
objects,  ideals  are  affine  sets,  and  cones  are  closed  under  convex  combinations  and 
nonnegative  scalings  (i.e.,  they  are  actually  cones  in  the  convex  geometry  sense). 
These  convexity  properties,  coupled  with  the  relationships  between  SDP  and  SOS, 
will  be  key  for  our  developments  in  the  next  section. 


Infeasibility  certificates  If  a  system  of  equations  does  not  have  solutions,  how 
do  we  prove  this  fact?  A  very  useful  concept  is  that  of  certificates ,  which  are  formal 
algebraic  identities  that  provide  irrefutable  evidence  of  the  nonexistence  of  solutions. 

We  briefly  illustrate  some  well-known  examples  below.  The  first  two  deal  with  linear 
systems  and  polynomial  equations  over  the  complex  numbers,  respectively. 


•  Range/kernel:  Ax  —  b  is  infeasible  o  3p  s.t.  ATp  —  0,  bT  p  =  —1. 

•  Hilbert’s  Nullstellensatz:  Let  fi{z), . . . ,  /m(z)  be  polynomials  in  complex  vari¬ 
ables  Z\, . . . ,  zn.  Then, 


fi(z)  =  0  (i  =  1, . . . ,  m)  is  infeasible  in  C"  <=>  —  1  G  ideal(/i , . . . ,  fm). 


Each  of  these  theorems  has  an  “easy”  direction.  For  instance,  for  the  first  case,  given 
the  multipliers  p  the  infeasibility  is  obvious,  since 


I  Ax  —  b  =>  pF  Ax  =  pTb  =»  0  =  —1, 

which  is  clearly  a  contradiction.  The  two  theorems  above  deal  only  with  the  case  of 
equations.  The  inclusion  of  inequalities  in  the  problem  formulation  poses  additional 
algebraic  challenges,  because  we  need  to  work  on  an  ordered  field.  In  other  words, 
we  need  to  take  into  account  special  properties  of  the  reals,  and  not  just  the  complex 
numbers. 


For  the  case  of  linear  inequalities,  LP  duality  provides  the  following  characterization 
(Farkas  lemma): 

Ax  +  b  —  0  .  .  r  ., ,  . 

~  _  is  infeasible  3  A  >  0,  p  s.t. 

Cx  +  d  >  0  — 

Although  not  widely  known  in  the  optimization  community  until  recently,  it  turns 
out  that  similar  certificates  do  exist  for  arbitrary  systems  of  polynomial  equations 
and  inequalities  over  the  reals.  The  result  essentially  appears  in  this  form  in  [60] ,  and 
is  due  to  Stengle  [61],  and  is  called  Positivstellensatz. 

I  ff\  Z  n  r  =  ]’ ' '  ‘  ’  is  infeasible  in  TZn 

l  9i(x)  >  0,  (t  =  l,...,p) 

$ 

f  F(x)  +  G(x)  =  -1 

3F(z),G(®)eft[s]s.t.  I  F(x)€ideal(/1,...,/m) 

[  G(x)  €  cone(gu...,gp). 


(  ATp  +  CT  A  =  0 
[  bTg.  +  cf'X  —  — 1. 


The  theorem  states  that  for  every  infeasible  system  of  polynomial  equations  and 
inequalities,  there  exists  a  simple  algebraic  identity  that  directly  certifies  the  nonexis¬ 
tence  of  real  solutions.  By  construction,  the  evaluation  of  the  polynomial  F(x)  +  G(x) 
at  any  feasible  point  should  produce  a  nonnegative  number.  However,  since  this 
expression  is  identically  equal  to  the  polynomial  —1,  we  arrive  at  a  contradiction. 
Remarkably,  the  Positivstellensatz  holds  under  no  assumptions  whatsoever  on  the 
polynomials. 

In  the  worst  case,  the  degree  of  the  infeasibility  certificates  F(x),  G(x)  could  be  high 
(of  course,  this  is  to  be  expected,  due  to  the  NP-hardness  of  the  original  question). 
In  fact,  there  are  a  few  explicit  counterexamples  where  large  degree  refutations  are 
necessary  [62],  Nevertheless,  for  many  problems  of  practical  interest,  it  is  often  the 
case  that  it  is  possible  to  prove  infeasibility  using  relatively  low-degree  certificates. 
There  is  significant  numerical  evidence  that  this  is  the  case,  as  indicated  by  the  large 
number  of  practical  applications  where  SDP  relaxations  based  on  these  techniques 
have  provided  solutions  of  very  high  quality. 


Degree  \  Field 

Complex 

Real 

Linear 

Range/ Kernel 

Farkas  Lemma 

Linear  Algebra 

Linear  Programming 

Polynomial 

Nullstellensatz 

Positivstellensatz 

Bounded  degree:  Linear  Algebra 
Groebner  bases 

Bounded  degree:  SDP 

Table  1:  Infeasibility  certificates  and  associated  computational  techniques. 


Of  course,  we  are  concerned  with  the  effective  computation  of  these  certificates.  For 
the  Positivstellensatz,  notice  that  the  cones  and  ideals  as  defined  above  are  always 
convex  sets  in  the  space  of  polynomials.  A  key  consequence  is  that  the  conditions 
in  Posivstellensatz  for  a  certificate  to  exist  are  therefore  convex,  regardless  of  any 
convexity  property  of  the  original  problem.  Even  more,  the  same  property  holds  if 
we  consider  only  bounded-degree  sections,  i.e.,  the  intersection  with  the  set  of  poly¬ 
nomials  of  degree  less  than  or  equal  to  a  given  number  D.  In  this  case,  the  conditions 
in  the  P-satz  have  exactly  the  form  of  a  SOS  program!  Of  course,  as  discussed  ear¬ 
lier,  this  implies  that  we  can  find  bounded-degree  certificates,  by  solving  semidefinite 
programs.  In  Table  1  we  present  a  summary  of  the  infeasibility  certificates  discussed, 
and  the  associated  computational  techniques. 

As  outlined  in  the  preceding  paragraphs,  there  is  a  direct  connection  going  from 
general  polynomial  optimization  problems  to  SDP,  via  P-satz  infeasibility  certificates. 
Pictorially,  we  have  the  following: 


Polynomial  systems  =>  P-satz  certificates  =$>  SOS  programs  =>  SDP 

Even  though  we  have  discussed  only  feasibility  problems,  there  are  obvious  straight¬ 
forward  connections  with  optimization.  By  considering  the  emptiness  of  the  sublevel 
sets  of  the  objective  function,  sequences  of  converging  bounds  indexed  by  certificate 
degree  can  be  directly  constructed. 


2.1.3  SOSTOOLS 

SOSTOOLS  [63,  64]  is  a  free,  third-party  MATLAB  toolbox  for  solving  sum  of  squares 
programs.  The  functions  implemented  in  SOSTOOLS  are  based  on  the  sum  of  squares 
decomposition  of  multivariate  polynomials  [65],  which  can  be  efficiently  computed 
using  semidefinite  programming  [55].  SOSTOOLS  was  the  result  of  the  recent  interest 
in  sum  of  squares  polynomials  [66,  65,  67,  27,  68,  69,  51],  partly  due  to  the  fact  that 
these  techniques  provide  convex  relaxations  for  many  computationally  hard  problems 
such  as  global,  constrained,  and  boolean  optimization  [66,  69,  51,  70,  71,  72], 

In  addition  to  the  optimization  problems  mentioned  above,  sum  of  squares  polynomi- 


als  (and  hence  SOSTOOLS)  find  applications  in  several  systems  analysis  and  control 
theory  problems,  such  as  nonlinear  stability  analysis  [27,  57,  73,  34],  robustness  anal¬ 
ysis  [27,  57,  34,  14],  nonlinear  synthesis  [74,  75],  and  model  validation  [76,  14].  Some 
other  areas  in  which  SOSTOOLS  is  applicable  are  geometric  theorem  proving  [77] 
and  quantum  physics  [29]. 

Currently,  sum  of  squares  programs  are  handled  by  reformulating  them  as  semidef- 
inite  programs  (SDPs),  which  in  turn  are  solved  efficiently,  e.g.  using  interior  point 
methods.  Several  commercial  and  non-commercial  software  packages  are  available  for 
solving  SDPs.  While  the  conversion  from  SOS  programs  to  SDPs  can  be  performed 
manually  for  small  size  instances  or  tailored  for  specific  problem  classes,  such  a  con¬ 
version  can  be  quite  cumbersome  to  perform  in  general.  It  is  therefore  desirable  to 
have  a  tool  that  automatically  performs  this  conversion  for  general  SOS  programs. 
This  is  exactly  where  SOSTOOLS  comes  to  play.  It  automates  the  conversion  from 
SOS  program  to  SDP,  calls  the  SDP  solver,  and  converts  the  SDP  solution  back  to 
the  solution  of  the  original  SOS  program.  At  present,  it  uses  another  free  MATLAB 
add-on  called  SeDuMi  [78]  as  the  SDP  solver. 

All  polynomials  in  SOSTOOLS  are  implemented  as  symbolic  objects,  making  full  use 
of  the  MATLAB  Symbolic  Math  Toolbox’s  capabilities.  This  gives  to  the  user  the 
benefit  of  being  able  to  do  all  polynomial  manipulations  using  the  usual  arithmetic 
operators:  +,  as  well  as  differentiation,  integration,  point  evaluation,  etc.  In 

addition,  this  provides  the  possibility  of  interfacing  with  the  Maple  symbolic  engine 
and  library,  which  is  advantageous. 

The  user  interface  has  been  designed  to  be  as  simple,  easy  to  use,  and  transparent 
as  possible.  A  user  creates  an  SOS  program  by  declaring  SOS  program  variables, 
adding  SOS  program  constraints,  setting  the  objective  function,  and  so  on.  After  the 
program  is  created,  the  user  calls  one  function  to  run  the  solver.  Finally,  the  user 
retrieves  solutions  to  the  SOS  program  using  another  function. 

SOSTOOLS  is  available  for  free  under  the  GNU  General  Public  License.  The  software 
and  its  user’s  manual  can  be  downloaded  from  the  SOSTOOLS  website  [79].  It 
requires  MATLAB  version  6.0  or  later,  SeDuMi  version  1.05,  and  the  Symbolic  Math 
Toolbox  version  2.1.2.  SOSTOOLS  can  be  easily  run  on  a  UNIX  workstation  or 
on  a  Windows  PC.  It  utilizes  the  MATLAB  sparse  matrix  representation  for  good 
performance  and  to  reduce  the  amount  of  memory  needed.  To  give  an  illustrative 
figure  of  the  computational  load,  all  the  demo  files  that  are  distributed  along  with 
SOSTOOLS  can  be  solved  in  less  than  8  seconds  by  SOSTOOLS  running  on  a  PC 
laptop  with  a  933  Mhz  Intel  Pentium-Ill  processor  and  384  MBytes  of  RAM. 


2.2  Stability  Analysis  Using  Sum  of  Squares  Programming 


It  is  striking  that  there  are  now  so  many  areas  of  research  in  which  the  Sum  of 
Squares  approach  and  positivstellensatz  find  application.  Formulating  the  problem 
as  the  emptiness  of  a  set  is  an  important  step,  but  this  only  involves  algebraic  in¬ 
equalities.  A  natural  question  to  ask  is  whether  we  can  use  the  aforementioned  tools 
to  answer  meaningful  questions  on  models  of  complex  biological  systems,  in  other 
words  whether  the  properties  of  a  dynamical  system  can  be  inferred  by  constructing 
these  proofs.  Such  questions  may  involve  the  functionality  of  the  system,  something 
that  simulation  alone  cannot  guarantee.  In  particular,  the  stability  of  the  equilibria 
of  a  particular  biological  system  may  be  of  interest.  The  question  therefore  is  whether 
positivstellensatz  and  the  Sum  of  Squares  decomposition  can  provide  an  alternative 
to  simulation  for  nonlinear  systems. 

This  problem  is  particularly  important  for  biological  systems,  as  the  nonlinearities 
and  their  possible  hybrid  nature  can  not  be  neglected  and  the  tools  to  date  that 
use  exhaustive  simulations  are  doomed  by  computational  complexity  as  the  state 
dimension  increases  and  the  number  of  parameters  increases.  As  an  alternative  to 
simulation,  the  concept  of  Lyapunov  function  can  be  used  as  a  certificate  for  stability 
of  the  system.  Let  us  consider  a  system  x  =  f(x)  which  has  an  equilibrium  x*,  with 
a  neighborhood  X.  Lyapunov  functions,  usually  denoted  by  V(x)  are  nothing  but 
energy-like  functions  that  have  the  following  properties: 

V(x*)  =  0,  (3) 

y(x)>0  Vs€*\{x*},  (4) 

^  =  |£(*)/(*)  <0  Vx  e  X.  (5) 

For  example,  think  of  the  pendulum  with  friction;  the  sum  of  the  kinetic  and  potential 
energy  can  serve  as  a  Lyapunov  function,  in  that  at  the  point  of  rest,  the  energy  is 
zero,  is  positive  in  any  other  configuration,  and  while  the  system  evolves  the  energy 
is  non-increasing,  due  to  dissipation  because  of  friction  at  the  hinge. 

Lyapunov  functions  are  at  the  center  of  nonlinear  systems  analysis  and  design.  When 
the  systems  under  investigation  are  uncertain,  they  can  also  be  parameterized  ap¬ 
propriately  to  serve  as  a  robust  stability  proof,  i.e.  a  proof  that  the  system  is  stable 
for  all  values  of  the  parameters.  A  more  general  class  of  systems,  such  as  systems 
with  equality,  inequality  and  Integral  Quadratic  constraints  can  also  be  tackled  by 
constructing  Lyapunov  functions  (see  the  Appendix  for  more  details). 

Let  us  turn  back  to  the  two  Lyapunov  properties,  namely  the  positive  definiteness 
condition  on  V(x),  i.e.  V{x)  >  0  and  the  negative  semidefiniteness  of  its  time  deriva¬ 
tive,  i.e.  <  0.  These  are  essentially  positivity  properties.  How  about  using 

the  Sum  of  Squares  decomposition  to  check  them?  This  idea  is  indeed  the  step  that 
opened  up  the  way  to  an  algorithmic  analysis  of  nonlinear  systems.  People  understood 


that  the  conditions  in  Lyapunov’s  stability  theorem  were  difficult  to  test  and  there 
was  no  constructive  methodology  to  construct  these  functions,  but  it  took  a  century 
to  develop  the  theory  necessary  for  the  construction  of  these  energy-like  functions 
through  the  Sum  of  Squares  decomposition  algorithmically  [27].  The  fact  that  there 
might  be  parametric  uncertainty  or  other  types  of  uncertainty  in  the  system  can  also 
be  dealt  directly;  parameterized  Lyapunov  functions  can  be  constructed  in  the  same 
unified  manner.  We  can  now  obtain  information  about  the  properties  of  the  system 
further  away  from  the  equilibrium,  that  no  linearisation  procedure  could  provide  us. 

The  construction  of  Lyapunov  functions  in  some  region  of  the  equilibrium  reveals 
estimates  of  the  ‘region  of  attraction’  of  it;  Sometimes  the  presence  of  other  equilibria 
render  any  statement  we  make  about  the  system  non-global;  this  implies  the  use  of 
inequality  constraints  in  the  state-space,  to  restrict  the  construction  of  the  Lyapunov 
function  in  some  region  around  the  equilibrium,  a  formulation  that  can  be  taken  into 
account  in  a  unified  manner.  In  fact  quite  a  lot  of  problems  in  nonlinear  dynamical 
systems  theory  can  now  be  answered  algorithmically,  such  as  hybrid,  time-delay, 
stochastic  etc. 

What  might  be  misunderstood  is  that  the  system  vector  field,  i.e.  f(x)  in  =  f(x) 
should  be  in  a  polynomial  form,  so  that  the  Sum  of  Squares  decomposition  of  poly¬ 
nomials  can  be  used  to  construct  the  corresponding  Lyapunov  function.  However  it 
is  always  possible,  through  a  series  of  changes  of  variables  and  recasting  to  put  the 
system  description  into  polynomial,  plus  a  few  ‘constraints’  that  describe  the  new 
variables.  While  most  biological  systems  have  or  can  be  approximated  by  polyno¬ 
mial  descriptions,  in  many  cases  models  have  rational  or  fractional  vector  fields;  they 
appear  naturally  with  the  use  of  Hill  Functions  to  describe  a  reaction’s  velocity  in 
the  Michaelis-Menten  sense.  In  this  case  the  denominator  of  the  system  description 
has  always  the  same  sign,  as  otherwise  one  is  faced  with  the  unrealistic  case  of  a 
system  with  a  finite  escape  time.  Therefore  the  quality  of  the  vector  field  and  its 
properties  would  not  change  if  one  multiplied  out  the  vector  field  by  its  least  common 
denominator,  thus  producing  a  system  in  polynomial  form  ready  for  analysis  using 
the  Sum  of  Squares  decomposition.  Virtually  any  problem  that  can  be  formulated 
with  a  finite  number  of  polynomial  equalities  and  inequalities  fits  in  this  framework. 
The  theory  that  we  developed  is  indeed  unified  in  this  sense,  and  in  all  that  will 
follow  all  these  concerns  will  find  solution  by  resorting  to  a  generalization  of  Lya¬ 
punov’s  stability  theorem  to  systems  with  equality,  inequality  and  Integral  Quadratic 
constraints.  Even  hybrid  systems  can  be  dealt  with  directly  in  this  same  framework, 
and  we  will  come  back  to  this  later.  Also,  the  methodology  used  to  cover  the  stiff 
dynamics  of  equations  by  IQCs  can  be  extended  to  systems  containing  nonlinearities 
and  the  resulting  system  can  still  be  analyzed  using  the  sum  of  squares  machinery. 

Similar  analysis  using  sum  of  squares  programming  has  also  been  developed  for  hybrid 
systems.  Stability  analysis  of  switched  and  hybrid  systems  has  been  treated  in  e.g.  [80, 
81,  82,  83].  See  also  [84]  for  a  recent  survey  of  the  field.  The  developments  in  this  area 
have  been  amazing,  but  again  have  been  restricted  to  the  analysis  of  linear  switching 


events  with  simple  switching  rules,  etc.  One  way  of  proving  stability  of  switched 
and  hybrid  systems  is  by  using  piecewise  quadratic  Lyapunov  functions  [80,  82,  83], 
which  are  constructed  by  concatenating  several  quadratic  Lyapunov-like  functions 
across  the  discrete  modes  of  the  system.  This  approach  is  quite  effective  but  in  many 
cases  it  can  be  conservative.  Nonetheless,  by  constructing  polynomial  and  piecewise 
polynomial  Lyapunov  functions  using  the  sum  of  squares  techniques,  the  power  of 
the  method  can  be  significantly  amplified.  The  method  generalizes  previous  analysis 
methods  using  quadratic  and  piecewise  quadratic  Lyapunov  functions.  Some  features 
of  the  new  approach  are  as  follows.  First,  stability  can  be  proven  with  a  smaller 
number  of  Lyapunov-like  functions,  eliminating  the  need  of  refining  the  state  space 
partition.  Second,  the  method  can  be  applied  to  systems  with  nonlinear  subsystems 
and  nonlinear  switching  surfaces.  Finally,  parametric  robustness  analysis  can  be 
performed  in  a  straightforward  manner. 


2.3  Formal  Methods  for  Reachability  Analysis 

It  has  been  noted  previously  that  biological  processes  are  multiscale  and  stochastic. 
Simulation  and  analysis  of  complex  models  of  this  kind  is  quite  challenging.  We 
have  also  emphasized  a  need  for  modeling  systems  at  different  levels  of  abstraction. 
These  different  abstract  models  are  represented  in  different  mathematical  formalisms. 
One  particularly  useful  formalism  for  representing  useful  abstractions,  especially  for 
purposes  of  analysis  and  simulation,  is  that  of  hybrid  systems,  either  deterministic, 
nondeterministic  (uncertain),  or  stochastic  (uncertain,  with  underlying  probabilistic 
interpretation). 

Hybrid  systems  incorporate  both  continuous-time  dynamics  and  discrete  elements. 
The  continuous  dynamics  are  given  using  time  varying  variables  through  differential 
equations.  In  the  realm  of  biological  process  modeling,  the  discrete  dynamics  can 
arise  in  at  least  three  different  ways.  First,  abstraction  and  simplification  of  a  con¬ 
tinuous  model  can  result  in  discrete  dynamics.  For  example,  systems  that  exhibit 
multiscale  dynamics  can  be  simplified  by  replacing  certain  slowly  changing  variables 
by  their  piecewise  constant  approximation.  This  is  done  when  interest  is  in  ana¬ 
lyzing  the  system  on  a  small  time  scale.  Additionally,  sigmoidal  nonlinearities  are 
commonly  observed  in  biological  data  correlation  and  the  corresponding  models  often 
use  (continuous)  sigmoidal  functions.  These  can  also  be  approximated  by  discrete 
transitions  between  piecewise-linear  regions.  A  second  source  of  discrete  behavior  is 
the  presence  of  an  inherently  discrete  process.  For  example,  dynamics  in  the  presence 
of  small  number  of  molecules  are  best  described  using  discrete  steps.  Finally,  faulty 
modes  may  also  be  modeled  using  discrete  mode  changes.  For  example,  in  normal 
conditions,  the  kidney  does  not  excrete  any  glucose,  but  it  starts  excreting  glucose 
if  the  level  of  glucose  rises  very  high.  This  change  can  be  captured  using  a  discrete 
transition. 


Uncertainties  and  stochastic  behavior  are  common  in  biology.  Rate  constants  and 
several  other  parameters  in  models  of  biological  systems  are  determined  using  algo¬ 
rithms  for  determining  minimal  error  curve  fits  for  available  data  points.  Parameter 
values  obtained  this  way  are  “representative”  values,  they  do  not  capture  all  observed 
behaviors.  The  actual  value  of  the  parameter  is  possibly  stochastic  in  a  given  range. 
In  many  cases,  we  are  interested  in  knowing  about  all  possible  behaviors  of  the  sys¬ 
tem,  rather  than  the  behavior  of  the  system  assuming  a  representative  value  for  the 
parameters.  For  example,  when  studying  the  effect  of  insulin  injections  on  blood 
glucose  concentrations,  we  wish  to  know  all  likely  blood  glucose  concentrations  that 
a  human  body  may  exhibit.  In  such  cases  uncertainties  can  be  modeled  using  nonde¬ 
terminism  and  the  resulting  model  can  be  analyzed  for  all  possible  behaviors.  Thus, 
at  suitable  abstract  levels,  biological  processes  are  effectively  modeled  as  nondeter- 
ministic,  uncertain  hybrid  systems.  These  models  can  then  be  analyzed  for  safety 
properties,  that  is,  properties  about  all  possible  behaviors  of  the  system.  In  principle, 
safety  verification  or  reachability  analysis  aims  to  show  that  starting  at  some  initial 
conditions,  a  system  cannot  evolve  to  some  unsafe  region  in  the  state  space. 


2.3.1  Techniques  Based  on  Sum  of  Squares  Programming 


Recently,  a  set  of  techniques  based  on  convex  optimization  and  sum  of  squares  meth¬ 
ods  has  also  been  proposed  for  reachability  analysis  of  nonlinear  and  hybrid  systems 
[85,  86,  87,  88].  These  techniques  verify  temporal  properties  such  as  safety  (e.g., 
something  bad  never  happens),  reachability  (e.g.,  something  good  can  happen),  and 
eventuality/liveness  (e.g.,  something  good  will  surely  happen)  using  certain  functions 
of  states  called  barrier  certificates  and  density  functions  [85,  87].  Not  only  can  they 
be  applied  to  discrete  transition  systems,  because  of  their  deductive  nature  the  tech¬ 
niques  can  be  directly  applied  to  systems  with  infinite  or  even  uncountable  number 
of  states,  such  as  continuous  and  hybrid  systems. 

For  a  simple  illustration,  consider  a  continuous  system  x  =  f(x,d)  where  x  is  the 
state  of  the  system  taking  its  value  in  the  state  space  X  and  d  is  a  disturbance  input 
taking  its  value  in  V.  In  addition,  consider  X0  C  X  as  the  set  of  possible  initial  states, 
and  Xu  c  X  as  the  set  of  unsafe  states.  Suppose  there  exists  a  barrier  certificate,  i.e. , 
a  differentiable  function  B(x)  satisfying  the  inequalities 


B(x)  <  0  Vx  e  Xo, 

(6) 

B(x)  >0  Vx  €  Xu, 

(7) 

£5  D 

-^-(x)f(x,d)  <0  Vx  E  X  x  V. 
ox 

(8) 

Then  it  is  easy  to  see  that  the  safety  property  holds,  i.e.,  that  for  all  possible  initial 
state  Xq  €  X0  and  for  all  possible  disturbance  input  there  exists  no  trajectory  of 
the  system  that  goes  from  the  initial  set  to  the  unsafe  set.  As  another  illustration, 
consider  the  same  system  but  now  assume  that  XT  C  X  is  a  set  of  “good”  states  that 


should  be  reached  by  the  system.  Assuming  that  X  is  bounded,  then  the  existence 
of  B(x)  satisfying 


B(x )  <0  Vi  €  Xo, 

(9) 

B(x)  >0  Vi  e  dX, 

(10) 

D 

—  (i)/(i,  d)  <  -e  Vi  G  {X  \  XT)  x  V, 

(11) 

where  d X  is  the  boundary  of  X  and  e  is  a  positive  number,  will  prove  the  even¬ 
tuality/liveness  property,  i.e.,  that  for  all  initial  state  in  X0  and  under  all  possible 
disturbance  the  trajectory  of  the  system  will  reach  the  target  set  Xr  in  some  finite 
time.  Systems  with  hybrid  dynamics  can  be  treated  in  an  analogous  manner.  The 
idea  here  is  simply  to  ask  that  during  the  discrete  transition  the  value  of  B(x )  also 
satisfies  certain  non-increasing  conditions,  similar  to  what  we  have  in  (8)  and  (11). 

It  is  obvious  that  simulation  is  of  limited  use  to  address  the  verification  of  safety  or 
eventuality  properties  stated  above.  Since  the  state  of  the  system  is  uncountable, 
verifying  by  simulation  that  the  properties  hold  in  all  cases  is  never  exact,  simply 
because  it  is  impossible  to  test  all  system  behaviors.  In  fact,  simulation  alone  may 
fail  to  uncover  the  existence  of  bad  behaviors.  Using  barrier  certificates  and  density 
functions  to  prove  safety,  reachability,  and  eventuality  is  analogous  to  using  Lyapunov 
functions  to  prove  stability.  It  eliminates  the  needs  to  run  simulations,  to  explicitly 
compute  the  flow  of  the  system,  or  to  propagate  sets  of  states.  Another  consequence  of 
their  deductive  nature  is  that  the  techniques  are  applicable  to  nonlinear,  uncertain, 
and  constrained  systems.  Moreover,  safety  verification  of  stochastic  systems  can 
be  handled  by  computing  an  appropriate  barrier  certificate  which  upper-bounds  the 
probability  of  reaching  the  unsafe  set  [86].  In  this  case,  a  function  B(x)  which  is  a 
supermartingale,  i.e.,  whose  evolution  along  time  is  non-increasing  on  the  average ,  is 
used. 

The  conditions  that  must  be  satisfied  by  barrier  certificates  and  density  functions 
are  formulated  as  convex  programming  problems.  In  addition  to  benefits  in  terms  of 
computation,  the  duality  structure  inherent  because  of  their  formulation  as  convex 
programs  also  gives  theoretical  advantages.  For  example,  a  completeness  statement 
in  safety  verification  using  barrier  certificates  has  been  obtained  by  exploiting  this 
duality  structure  [88].  For  continuous  and  hybrid  systems  whose  descriptions  are  in 
terms  of  polynomials,  sum  of  squares  programming  described  in  Section  2.1  provides 
a  hierarchy  of  scalable  algorithmic  methods  for  computing  barrier  certificates  and 
density  functions,  where  at  each  level  the  computational  cost  grows  polynomially 
with  respect  to  the  system  size.  The  computation  can  be  performed  efficiently  using 
semidefinite  programming,  for  example  using  the  software  SOSTOOLS.  Because  of 
the  possibility  to  use  sum  of  squares  programming  for  computing  barrier  certificates 
and  density  functions,  the  methodology  seems  to  be  more  scalable  than  many  other 
existing  methods  that  can  handle  nonlinear  continuous  and  hybrid  systems.  Successful 
application  of  the  method  for  verifying  the  safety  property  of  a  NASA  life  support 


system,  which  is  a  nonlinear  hybrid  systems  with  6  discrete  modes  and  10  continuous 
states,  has  been  reported  in  [89]. 


3  New  Results  and  Challenges 

3.1  Model/Data  Comparison  for  Validation  of  Biological  Mod¬ 
els 

The  new  sum  of  squares  based  methodology  described  in  the  preceding  sections  pro¬ 
vides  for  the  first  time  a  systematic,  scalable  approach  to  robustness  analysis  and 
model  invalidation  for  nonlinear  and  hybrid  DAE  systems,  complementing  modelling 
and  simulation  with  a  powerful  proof  infrastructure.  While  it  builds  directly  on 
decades  of  research  in  robust  control  and  dynamical  systems,  it  represents  a  true  wa¬ 
tershed  in  these  subjects.  The  models  used  previously  are  simple  enough  that  ad  hoc 
approaches  are  modestly  effective,  but  even  here  the  sum  of  squares  formalisms  are 
much  more  efficient  and  effective,  and  they  scale  to  larger  problems.  Furthermore, 
we  believe  that  all  of  the  above  methods  could  in  principle  be  taught  at  the  under¬ 
graduate  level,  potentially  streamlining  the  teaching  of  much  of  systems  theory  and 
giving  broad  access  to  powerful  tools. 


3.1.1  The  General  Form  of  Data  Collaboration:  SBPriME 


GRI-Mech  and  SBPriMEare  the  work  of  our  colleagues  Andy  Packard  and  Michael 
Frenklach,  who  are  collaborating  with  Adam  Arkin  (all  at  UC  Berkeley)  on  the  Al¬ 
liance  for  Cellular  Signalling  (AfCS)  modeling  project.  The  AfCS  is  a  large  project 
funded  by  NIH  that  is  providing  us  with  a  modern  example  of  post-genomics  biol¬ 
ogy  research,  and  as  such  has  been  a  major  driver  for  the  development  of  our  tools. 
They  are  currently  the  most  sophisticated  users  of  SOSTOOLS  in  the  biology  com¬ 
munity.  The  GRI-Mech  approach  that  has  been  adopted  for  system  identification  in 
the  AfCS  modeling  effort  puts  theory/models  and  data  on  the  same  footing.  It  does 
not  change  the  way  experimentation  is  done,  but  requires  a  different  approach  to 
analyzing  even  one’s  own  observations  and,  as  a  consequence,  places  new  standards 
on  data  reporting.  In  this  approach,  referred  to  here  as  SBPriME,  measured  data, 
its  estimated  uncertainty,  and  a  model  of  the  experimental  system  is  treated  as  an 
assertion  whose  correctness  depends  on  the  suitability  of  the  model  and  the  reliability 
of  the  measured  data.  Taken  together,  the  model  and  measurement  constitute  a  (low 
dimensional)  constraint  in  the  “global”  unknown  parameter  space.  Specifically,  only 
those  parameters  that  are  consistent  with  the  model/measurement  pair  are  possible 
values  of  the  unknown  parameters. 


Formally,  the  reportable  content  of  an  experiment  and  modeling  effort  consists  of 

1.  A  model,  M,  which  relates  input  signals,  output  signals,  initial  conditions,  mod¬ 
eled  unknown  parameters,  and  modeled  unknown  disturbances  in  an  implicit 
manner 

M(u,y,x0,Q,d)  =  0  (12) 

The  form  of  M  is  often  an  ordinary  differential  or  difference  equation,  derived 
from  a  combination  of  first-principles,  conservation  laws  and  additional  expert- 
supplied  assumptions. 

2.  A  parameter  set  0,  which  captures  the  a  priori  information  about  parameter 
values  as  9  €  © 

3.  A  disturbance  signal  set  V,  which  captures  the  a  priori  information  about  un¬ 
known  disturbances  as  d  €  V 

4.  Data,  the  measured  quantities  («data,  2/data,  £o, data) 


Together,  these  form  a  constraint  on  the  parameter  space,  that  implied  by  writing  all 
the  information  in  a  “publicly”  accessible  manner 

A/,  (^data,  2/data,  2-0, data)  ,  e  e  0,  dev.  (13) 

A  collection  of  these  constitute  a  multitude  of  assertions  about  the  joint  parameter 
space  encompassed  by  the  individual  assertions.  One  would  like  to  do  reasoning  on 
this  collection.  For  instance: 

•  Is  the  collection  consistent?  More  specifically,  is  the  set  of  unknown  parame¬ 
ters  which  satisfy  each  the  model/measurement  assertions  a  nonempty  set?  If 
not,  then  something  is  wrong  about  the  collection,  invalidating  at  least  one  of 
the  model/measurement  assertions.  Moreover,  a  proof  (experiment  list,  math 
programming  utilities)  to  illustrate  the  invalidation  exists.  As  an  example,  in 
section  3.1.3  we  illustate  a  class  of  uncertain,  nonlinear  dynamical  systems  for 
which  this  consistency  question  reduces  to  linear  programming. 

•  Can  several  consistent  assertions  be  collapsed  (i.e.,  reduced)  into  a  single  asser¬ 
tion  whose  description  is  simpler  than  simply  the  collection  from  which  it  was 
inferred? 

•  Which  model/measurement  pairs  have  the  most  impact  on  the  collection’s 
(in)consistency?  Answering  this  can  flag  assertions  that  are  possibly  incorrect, 
though  self-consistent. 

•  Which  model  assumptions  have  the  most  impact  on  the  collection’s  (in)consistency? 
We  want  to  look  for  architectures  that  are  both  consistent  and  whose  consis¬ 
tency  is  robust  to  certain  variations.  Answering  the  posed  question  will  flag 
assumptions  within  assertions  that  the  consistency  is  fragile  with  respect  to. 


•  What  is  the  tightest  range  of  predictions  about  an  additional  process  model  (i.e., 
its  quantity  of  interest)  possible  given  that  these  predictions  must  be  consistent 
with  the  collection? 

•  What  is  the  utility  of  a  hypothetical  experiment  to  further  knowledge  regarding 
the  system.  In  this  framework,  “what-if”  questions  can  be  posed  and  addressed. 


Informally,  we  refer  to  these  questions  as  Model/Data  comparison  problems.  It  is 
important  to  note  that  all  of  these  questions  lead  to  set-intersection  questions,  which 
can  be  posed  as  constrained  optimization  problems.  The  purely  mathematical  task 
of  extracting  desired  information  from  all  reported  experiments  is  relegated  to  high- 
powered,  scalable,  global  optimization  algorithms  described  in  section  2.1.  In  most 
cases,  duality  results  yield  “derivative” -like  information  as  well.  This  may  be  used, 
somewhat  heuristically,  in  order  to  quickly  screen  (from  a  long  list)  specific  model/data 
comparison  problems  to  more  fully  analyze. 


3.1.2  Connecting  with  System  Identification 


Our  approach  shares  characteristics  with  conventional  system  identification.  The 
latter  also  treats  parameterized  models  and  the  use  of  experimental  data  to  better 
characterize  model  parameters.  Identification  typically  involves  assumptions  regard¬ 
ing  the  noise  properties  of  measurements  and  disturbances,  and  optimization  of  a  cost 
function  to  decide  a  best  (maximum- likelihood  (ML),  maximum  a  posteriori  proba¬ 
bility)  parameter  value  based  on  these  noise  statistics,  the  experimental  data,  and 
an  a  priori  distribution  on  the  parameters.  In  addition,  it  computes  estimates  of 
the  variances  in  this  optimal  parameter.  Because  of  this  probabilistic  framework, 
notions  of  data/model  invalidation  are  less  crisp.  By  contrast,  our  approach  can  be 
thought  of  as  deterministic  and  worst-case,  tracking  the  feasibility  of  well-defined 
inequalities  drawn  from  models  and  data.  No  attempt  is  made  to  characterize  the 
distribution  of  the  parameters;  inferences  are  drawn  from  the  set  of  parameters  that 
are  not  invalidated  by  the  model/data  pair. 

Although  our  numerical  techniques  take  a  deterministic  approach  to  experimental  er¬ 
rors  and  parameter  uncertainty,  the  necessity  for  collaborative  data  processing  (and 
software  to  support  this)  put  forth  in  this  proposal  are  equally  relevant  in  this  tra¬ 
ditional  identification  setting.  To  properly  account  for  coupled  uncertainty  between 
multiple  experimental  analyses,  data  must  be  shared  and  reasoned  with  in  a  single 
processing  step.  The  sum-of-squares  relaxations  developed  to  solve  the  data/model 
comparison  problem  will  also  benefit  the  identification  community  by  virtue  of  im¬ 
provements  in  constrained  global  optimizations  methodologies. 


3.1.3  Specific  Model  Forms 


An  instance  of  the  model/data  comparison  problem  outlined  in  the  previous  section 
with  a  tractable  solution  follows.  Consider  a  discrete-time  model,  with  uncertain 
coefficients  whose  evolution  is  governed  by 

N 

y k  =  Pm(yk-i,  ■  ■  •  ,yk-M,Uk-i, . . .  ,uk-M)  +  dk  (14) 

m= 1 


The  polynomials  (pm}^=1  are  known.  The  parameter  vector  6  and  the  disturbance 
sequence  d  are  unknown.  A  priori  information  consists  of  linear  inequalities  for  6  as 
well  as  for  d , 


(15) 


Given  a  sequence  of  data  (y,  u)  and  initial  values  (y~i,  y_2,  • . . ,  V-m),  the  model/data 
comparison  problem  is  to  determine  if  there  exists  a  parameter  value  9  and  distur¬ 
bance  sequence  d  consistent  with  the  apriori  information  (the  linear  inequalities  in 
equation  15)  such  that  the  data  is  reproduced  by  the  model,  or  to  prove  that  no  such 
combination  exists. 


The  structure  of  the  system  in  equation  14  yields  a  validation  (and  hence  falsification) 
problem  decided  with  linear  programming.  Consequently,  all  of  the  desirable  proper¬ 
ties  of  linear  programming  (robust  solvability,  duality  theory,  favorable  computational 
growth  with  problem  size,  a  well  defined  notion  of  ill-conditioning)  translate  to  this 
model/data  comparison  problem.  A  simple  alteration  of  the  vector  field  so  that  it 
not  linear  in  the  parameters,  completely  changes  the  complexity  of  the  model/data 
comparison  problem.  Consider 

yk  =  p(yk-i,  ■  ■  ■  ,Vk-M,Uk-i,  ■  ■  ■  ,Uk-M,&)  +  dk  (16) 

The  polynomial  p  is  known.  Again,  the  parameter  vector  6  and  the  disturbance 
sequence  d  are  unknown  and  a  priori  information  consists  of  (e.g.)  the  linear  inequal¬ 
ities  in  equation  15.  This  problem  is  not  decidable  by  LP,  though  can  be  attacked 
in  a  scalable  manner  by  the  more  powerful  analysis  techniques  available  with  SOS 
programming  and  the  barrier  certificate  approach,  described  below. 


3.1.4  System  ID  challenges 

The  basic  concepts  in  systems  ID  are  well-known  but  implementation  in  the  context 
of  our  goals  for  biology  leads  to  computational  challenges  that  we  have  begun  to 
address.  There  are  several  elements  that  must  be  combined  to  adequately  treat 
these  issues.  First,  parameters  in  biological  and  engineering  models  vary  enormously 
in  their  impact  on  network-level  phenotypes.  In  particular,  circuitry  is  designed 


or  has  evolved  to  be  largely  insensitive  to  large  variations  in  many  parameters  but 
with  extreme  sensitivity  to  a  few.  This  simultaneous  coexistence  of  robustness  and 
fragility  is  typical,  and  has  tremendous  benefits,  though  it  apparently  leads  to  some 
confusion.  One  consequence  is  that  robust  parameters  (in  the  sense  that  they  can 
vary  widely  with  little  phenotype)  will  typically  vary  widely  experimentally,  and  even 
if  they  do  not,  are  intrinsically  hard  to  identify  from  input-output  experiments  on 
the  intact  network.  This  is  well-known  in  control  theory.  Fortunately,  it  is  also  the 
case  that  it  is  less  important  that  these  parameters  be  known  accurately,  compared 
to  “fragile”  parameters.  What  is  not  trivial  is  having  the  entire  toolset  of  modeling, 
analysis,  and  system  ID  work  with  uncertain  parameters  throughout  without  the 
need  to  ultimately  assign  exact  values,  since  in  any  important  sense  there  are  no 
condition  independent  “true  values”  for  these  parameters.  Thus  our  use  of  explicitly 
uncertain  models  throughout  is  crucial,  as  is  the  unique  capability  of  our  methods 
to  provide  versatile  sensitivity  and  robustness  analysis  of  stochastic,  hybrid,  and 
nonlinear  dynamical  systems. 

It  is  possible  that  important  (e.g.  fragile)  parameters  are  nonetheless  poorly  esti¬ 
mated  from  a  given  set  of  data,  for  any  number  of  reasons,  and  the  determination 
of  the  next  high  value  added  experiment  requires  several  elements.  One  is  the  iden¬ 
tification  through  local  sensitivity  or  global  robustness  analysis  that  the  range  of 
parameter  values  that  are  unfalsified  by  data  is  inadequate  for  predicting  phenotypes 
of  interest.  The  next  is  to  turn  the  system  ID  and  model  (in)validation  tools  back¬ 
wards  to  identify  new  data  that,  based  on  current  models  and  understanding,  would 
be  most  informative.  These  all  lead  to  constrained  optimization  problems  which  are 
not  convex  except  in  the  most  trivial  cases.  Our  methods  are  aimed  at  solving  these 
optimization  problems  in  a  scalable  and  automated  way,  and  the  use  of  duality-based 
methods  are  particularly  useful  in  using  the  dual  variables  to  evaluate  sensitivities 
to  primal  constraints.  We  have  extensive  experience  with  these  issues  for  engineer¬ 
ing  control  systems  and  more  recently  in  the  context  of  methane  combustion,  but 
applications  in  biology  are  new  and  largely  untried. 

A  serious  difficulty  to  be  overcome  is  that  we  currently  rely  on  control  engineers  to 
be  very  sophisticated  users  of  robust  control  and  system  ID  tools  and  combine  them 
appropriately  in  specific  applications.  While  the  individual  steps  may  be  systematic, 
the  combination  currently  requires  too  much  user  expertise,  and  thus  is  not  automated 
and  only  scalable  in  the  hands  of  experts.  Overcoming  this  is  a  major  goal  of  our 
research,  but  progress  will  require  iteration  on  biologically  motivated  problems. 

It  is  often  the  case  that  biological/chemical/physical  understanding  can  be  used  to 
identify  those  dynamical  variables  or  other  outputs  whose  measurement  will  be  most 
useful  in  model  refinement,  model  invalidation,  and  parameter  estimation.  One  ad¬ 
vantage  of  our  methods  is  that  they  allow  direct  inclusion  of  modeling  information 
from  very  heterogeneous  sources  through  the  use  of  a  rich  class  of  constraints  on  state 
variables  and  parameters.  The  major  obstacle  to  using  these  methods  are  twofold. 
One  is  the  computational  complexity  of  the  resulting  analysis,  and  addressing  this  is 


the  centerpiece  of  our  proposed  research.  The  other  is  that  even  with  sophisticated 
tools,  biologists  must  be  able  to  describe  both  their  data  and  their  other  sources  of 
understanding,  which  can  come  from  very  diverse  sources,  in  mathematical  terms  on 
which  automated  inference  can  be  performed. 

An  issue  that  can  be  a  source  of  confusion  is  that  even  in  the  linear  case  and  even  when 
data  is  generated  from  a  toy  linear  model,  the  parameters  of  that  model  cannot  be 
recovered  exactly  in  the  presence  of  noise,  except  asymptotically.  This  is  well-known, 
as  are  various  results  on  the  effect  of  the  noise  forcing  on  the  convergence,  the  effects 
of  unmodeled  dynamics,  etc.  In  the  case  where  system  ID  is  done  with  real  data,  the 
notion  of  “true  parameters”  really  makes  no  sense.  For  both  of  these  reasons,  in  our 
PrIMe  system  ID  and  model  (in)validation  framework  we  focus  not  on  finding  “true” 
parameter  values  but  describing  sets  of  unfalsified  models.  The  framework  inherits 
from  robust  control  the  ability  to  naturally  handle  unmodeled  dynamics  and  varied 
noise  models,  and  SOSTOOLS  enables  the  systematic  use  of  nonlinear  models.  These 
are  all  the  most  advanced  state-of-the-art  capabilities  available.  As  importantly,  this 
includes  the  specific  sense  that  SOS/SDP  methods  recover  as  special  cases  many  of  the 
“gold-standard”  algorithms  that  were  previously  available.  PrIMe  and  SOSTOOLS 
thus  form  the  foundation  of  our  methodology.  Nevertheless,  we  expect  they  will  need 
to  be  extended  substantially  to  scale  to  many  problems  of  biological  interest. 


3.1.5  Blending  Surrogate  Model  and  Barrier  Certificate  Methods 


The  surrogate  model  approach  is  popular  in  chemical  kinetics  modeling,  and  has 
even  been  used  successfully  in  the  past  6  months  to  invalidate  textbook  models  of 
calcium  signalling  using  AfCS  data.  An  important  step  of  the  method  is  to  replace  the 
dynamical  model  with  a  static  model  mapping  the  effect  of  the  uncertain  parameters 
to  one  specific  feature  of  the  dynamic  model’s  response,  under  the  action  of  one 
specific  input  and  initial  condition.  This  drastically  reduces  the  complexity  of  the 
falsification  step.  Rather  than  dealing  with  dynamical  models,  the  falsification  step 
is  faced  with  checking  the  emptiness  of  a  collection  of  polynomial  inequalities,  which 
is  adequately  addressed  using  sum-of-squares  optimization  and  the  positivestellensatz 
theory  address  such  inequalities. 

The  barrier  certificate  approach  described  in  Section  2.3  works  directly  with  the 
dynamical  model,  and  is  applicable  to  model  validation  of  a  rich  class  of  dynamical 
systems.  Note  that  this  problem  can  also  be  treated  as  a  reachability  analysis  problem. 
If  a  measurement  indicates  that  the  initial  state  of  the  system  x  —  f(x)  is  contained 
in  a  set  Ao,  whereas  another  measurement  indicates  that  after  some  time  the  state  is 
in  XUl  then  the  existence  of  B(x)  satisfying  (6)-(8)  will  prove  inconsistency  between 
model  and  data,  hence  invalidate  the  model.  Information  on  the  measurement  time 
can  also  be  included  in  this  analysis,  by  augmenting  the  state  of  the  system  with 
time.  Applied  to  the  special  case  in  equation  (14),  for  example,  where  LP  is  sufficient 


to  decide  inconsistency,  the  barrier  certificate  approach  will  yield  the  same  decision 
(conclusion)  for  linear  barrier  certificates  satisfying  some  appropriate  conditions. 

Since  a  polynomial  of  a  given  order  is  determinable  from  a  collection  of  its  sample 
values,  it  is  possible  to  interpret  the  barrier  function  method  (and  even  do  the  compu¬ 
tations,  likely  more  numerically  well-conditioned)  as  operating  only  with  simulated 
data.  Viewed  in  this  vein,  the  surrogate  model  approach  and  the  barrier  function 
approach  have  similar  starting  points — families  of  simulations  of  the  parametrized 
model  over  its  parameter  space.  They  differ  in  the  use  of  this  simulated  data.  One 
goal  of  this  research  for  which  we  have  had  some  limited  initial  success  is  to  create  a 
family  of  falsification  methods,  based  on  model  simulations,  that  includes  as  a  special 
case  both  the  surrogate  model  and  barrier  function  approaches. 


3.2  Extending  Methods  for  Reachability  Analysis  of  Hybrid 
Systems 

There  have  been  in  the  past  several  independent  approaches  to  the  algorithmic  anal¬ 
ysis  of  hybrid  systems,  a  challenge  in  both  technological  and  biological  networks.  The 
combination  of  continuous  and  discrete  dynamics  presents  a  challenge  for  which  very 
few  good  tools  exists.  Even  in  the  case  of  purely  continuous  behavior,  arguably  much 
simpler,  the  computational  perspectives  seemed  grim  outside  the  linear  case.  Fortu¬ 
nately,  in  the  specific  case  of  the  highly  nonlinear  hybrid  systems  arising  in  biological 
models,  the  introduction  of  the  sum  of  squares  based  methodology  has  given  renewed 
hope  and  many  concrete  examples,  impossible  to  analyzed  by  other  methods. 

Clearly,  what  will  be  needed  in  order  to  extend  the  reach  of  the  methods  from  our 
initial  examples  to  medium  and  large-scale  biologically  and  technologically  relevant 
applications,  is  an  intensive  blend  of  the  best  elements  of  all  successful  approaches. 
In  particular,  as  lessons  from  the  computer  science  perspective,  we  extract  the  fun¬ 
damental  need  of  the  use  of  hierarchies  of  abstractions  to  decouple  the  successive 
theorem-proving  stages,  as  a  way  to  deal  with  complexity.  Such  a  strategy  appears 
every  time  difficult  problems  are  tackled,  in  domains  ranging  from  pure  mathematics 
to  VLSI  design.  The  first  steps  will  be  the  simpler  case  where  the  hierarchies  are 
designed  in  an  ad-hoc  fashion,  and  the  algorithmic  tools  are  used  at  each  successive 
level  of  inference  and  theorem  proving.  Further  down  the  road,  we  imagine  to  be 
able  to  use  feedback  from  the  dual  solutions,  to  produce  likely  counterexamples  and 
suggestions  on  possibly  optimal  decompositions.  Furthermore,  we  expect  the  efficient 
exploration  of  proof  space  performed  by  the  convex  optimization  algorithms  in  the 
SOS  case  is  a  novel  concept  for  the  theorem  proving  community,  and  one  that  we 
expect  to  be  of  benefit  in  some  of  the  central  tasks  in  other  domains. 


3.3  Analysis  of  Spatially  Distributed  Dynamics 


The  algorithmic  analysis  of  spatially  distributed  systems  has  been  in  the  center  of 
scientific  research  for  many  years  now  [90,  91,  92].  Most  techniques  that  are  employed 
to  understand  their  properties  center  on  discretization  of  the  describing  Partial  Differ¬ 
ential  Equation  (PDE)  and  performing  simulations  on  the  resulting  finite  dimensional 
description.  This  technique  is  particularly  useful  when  the  domain  of  definition  of 
the  PDE  is  complicated.  Our  effort  on  this  program  will  be  twofold;  first,  we  will 
develop  a  methodology  to  obtain  estimates  on  functional  outputs  of  the  PDE,  that 
can  guide  the  choice  of  an  adequate  mesh  size  so  that  the  simulation  output  will 
be  representative  of  the  PDE  description.  Secondly,  we  will  consider  common  PDE 
systems  on  simple  domains,  aiming  to  answer  analysis  questions  about  them  such  as 
stability,  model  verification  etc.  The  basis  for  the  algorithmic  technique  is  the  sum 
of  squares  decomposition  and  SOSTOOLS. 

The  first  issue  we  have  tackled  is  related  to  the  problem  of  estimating  lower  bounds 
on  functional  outputs  of  PDEs  [93] .  Such  a-priori  information  is  extremely  useful  for 
choosing  the  size  of  the  discretisation  required  to  capture  the  problem  features  in  a 
CFD  simulation.  Mesh  adaptivity  and  modern  a-posteriori  methods  may  not  perform 
well  in  certain  cases,  and  this  is  why  a-priori  methods  are  required.  Using  the  results 
of  model  invalidation,  there  are  two  methods  that  can  be  applied  to  obtain  such 
bounds.  In  the  first  one,  model  invalidation  can  be  applied  to  an  equivalent  system 
that  includes  the  output  functional  as  one  of  the  states;  invalidating  values  that  this 
new  state  can  achieve  provides  estimates  for  the  values  of  the  output  functional. 
The  other  method,  which  is  essentially  the  dual  of  the  first  one,  can  be  applied  in 
the  elliptic  and  parabolic  case  where  the  maximum  principle  applies;  the  solution  is 
approximated  by  a  polynomial  and  a  series  of  upper  and  lower  bounds  can  be  achieved 
by  solving  an  appropriately  formulated  sum  of  squares  problem.  The  differences 
between  the  two  methods  is  that  the  dual  method  can  be  used  for  PDEs  in  many 
dimensions  but  it  cannot  handle  nonlinear  functional  outputs  or  other  types  of  PDEs 
where  the  maximum  principle  does  not  apply,  whereas  the  first  approach  can  be 
applied  in  these  cases,  but  is  only  for  systems  in  one  dimension.  With  an  appropriate 
extension  of  the  notion  of  barrier  certificates  to  the  notion  of  barrier  functionals,  we 
expect  to  be  able  to  extend  this  to  PDEs  in  many  dimensions. 

The  second  issue  we  have  begun  to  address  is  related  to  analysis  questions  such  as 
stability,  model  invalidation,  etc.  The  standard  analysis  tools  center  on  the  con¬ 
struction  of  appropriate  Lyapunov-type  certificates.  Several  issues  arise,  such  as  the 
choice  of  the  norm  in  which  the  analysis  should  be  performed,  but  now  the  analy¬ 
sis  can  be  done  entirely  algorithmically.  Crucial  steps  in  any  analysis  procedure  for 
PDEs  involve  integration  by  parts  of  the  candidate  Lyapunov  functional,  establishing 
functional  positivity,  etc.,  for  which  we  have  well  developed  techniques.  The  difficul¬ 
ties  involve  choosing  candidate  structures  and  infinite-dimensionality  of  the  spaces. 
We  have  most  progress  on  infinite  dimensional  systems  of  a  particular  type:  time- 


delay  systems.  In  this  case,  the  so-called  Lyapunov-Krasovskii  functionals  can  been 
constructed  algorithmically  to  prove  stability  of  the  equilibria  [94].  Also,  analysis  of 
network  congestion  control  schemes  for  the  Internet,  for  arbitrary  topologies,  delays 
and  link  capacities  has  been  successfully  tackled  [95,  96].  Moreover,  results  in  this 
area  have  also  been  applied  to  nonlinear  stability  analysis  of  systems  ranging  from 
population  dynamics  to  economics. 
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